Disclaimer: This report has been written for the author's learning purposes only.
To inform the planning and provision of cancer treatment services, analyse breast cancer incidence data from the NHS Borders health board. Findings should be presented in a report of 1-2 pages, highlighting any significant insights and identified trends.
Between 1997 and 2021, incidences of breast cancer in males made up less than 1% of total breast cancer incidences. This report will therefore focus on incidences of breast cancer among females.
cancer_incidence_borders %>%
filter(cancer_site != "All Cancer Types",
sex == "All") %>%
group_by(cancer_site) %>%
summarise(total_incidences = sum(incidences_all_ages)) %>%
arrange(desc(total_incidences)) %>%
head(3) %>%
gt() %>%
tab_header(title = "Total Cancer Incidences") %>%
cols_label(
cancer_site = "Cancer Site",
total_incidences = "Total Incidences") %>%
tab_options(column_labels.font.weight = "bold",
table.align = "left")
Total Cancer Incidences

| Cancer Site | Total Incidences |
|---|---|
| Non-Melanoma Skin Cancer | 6174 |
| Basal Cell Carcinoma Of The Skin | 4049 |
| Breast | 2614 |
cancer_incidence_borders %>%
filter(cancer_site == "Breast",
sex != "All") %>%
group_by(sex) %>%
summarise(total_incidences = sum(incidences_all_ages)) %>%
arrange(desc(total_incidences)) %>%
head(3) %>%
gt() %>%
tab_header(title = "Breast Cancer Incidences by Sex") %>%
cols_label(
sex = "Sex",
total_incidences = "Total Incidences") %>%
tab_options(column_labels.font.weight = 'bold',
table.align = "left")
Breast Cancer Incidences by Sex

| Sex | Total Incidences |
|---|---|
| Females | 2598 |
| Male | 16 |
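
The "less than 1%" figure for males can be checked from the two totals in this table. A minimal sketch in base R, with the counts typed in by hand rather than read from `cancer_incidence_borders`:

```r
# Totals copied from the "Breast Cancer Incidences by Sex" table
breast_totals <- c(Females = 2598, Male = 16)

# Male share of all breast cancer incidences, as a percentage
male_share_pct <- breast_totals[["Male"]] / sum(breast_totals) * 100
round(male_share_pct, 2)
```

This gives a male share of roughly 0.6%, consistent with the statement that males account for less than 1% of incidences.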
According to NHS Borders data, breast cancer among females has the highest number of incidences, highest mean crude rate and highest mean European age-standardised rate (EASR) of any cancer type. See below for more detail on how the EASR is calculated.
cancer_incidence_borders %>%
filter(cancer_site != "All Cancer Types",
sex == "Females") %>%
group_by(cancer_site) %>%
summarise(total_incidences = sum(incidences_all_ages)) %>%
arrange(desc(total_incidences)) %>%
head(3) %>%
gt() %>%
cols_label(
cancer_site = "Cancer Site",
total_incidences = "Total Incidences") %>%
tab_options(column_labels.font.weight = 'bold')
| Cancer Site | Total Incidences |
|---|---|
| Breast | 2598 |
| Non-Melanoma Skin Cancer | 2519 |
| Basal Cell Carcinoma Of The Skin | 1882 |
cancer_incidence_borders %>%
filter(cancer_site != "All Cancer Types",
sex == "Females") %>%
group_by(cancer_site) %>%
summarise(mean_easr = mean(easr)) %>%
arrange(desc(mean_easr)) %>%
head(3) %>%
gt() %>%
cols_label(
cancer_site = "Cancer Site",
mean_easr = "Mean EASR") %>%
tab_options(column_labels.font.weight = 'bold')
| Cancer Site | Mean EASR |
|---|---|
| Breast | 161.3640 |
| Non-Melanoma Skin Cancer | 150.3996 |
| Basal Cell Carcinoma Of The Skin | 113.9178 |
To understand how these rates compare to other health boards in Scotland, we can visualise a five-year summary of the EASR, which is a “weighted mean of the age-specific rates where the weights are taken from the population distribution of a standard population; the ASR is expressed per 100,000” (European Commission, 2023).
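
To make the "weighted mean" definition concrete, here is a minimal sketch of an age-standardised rate in base R. The age bands, rates and weights are hypothetical values chosen for illustration only; they are not taken from the NHS Borders data or from the actual European Standard Population.

```r
# Hypothetical age-specific rates per 100,000 (illustrative values only,
# not taken from the NHS Borders data)
age_specific_rate <- c("0-49" = 40, "50-69" = 300, "70+" = 400)

# Hypothetical standard-population weights (illustrative; the real
# European Standard Population uses finer age bands)
standard_weight <- c("0-49" = 0.60, "50-69" = 0.25, "70+" = 0.15)

# Age-standardised rate = weighted mean of the age-specific rates
asr <- weighted.mean(age_specific_rate, w = standard_weight)
asr
```

Because every health board is weighted by the same standard population, differences in the resulting rates are not driven by differences in local age structure.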
geo_summary %>%
ggplot(aes(fill = easr)) +
geom_sf(colour = "white", linewidth = 0.04) +
labs(
title = "Female Breast Cancer EASR (2017-2021)",
subtitle = "By NHS Health Board",
fill = "EASR") +
scale_fill_distiller(palette = "Blues", direction = +1) +
theme(panel.background = element_rect(fill = "white"),
axis.text.x = element_blank(),
axis.text.y = element_blank(),
axis.ticks = element_blank(),
rect = element_blank(),
axis.title.x = element_blank(),
axis.title.y = element_blank())
NB: Unfortunately, data for the individual health boards NHS Western Isles, NHS Shetland and NHS Orkney were not available at the time of report completion.
five_year_summary %>%
select(hb, cancer_site, sex, year, easr) %>%
filter(sex == "Females",
cancer_site == "Breast",
hb != "GR0800001") %>%
left_join(geography_codes, "hb") %>%
select(hb_name, year, sex, cancer_site, easr) %>%
arrange(desc(easr)) %>%
gt() %>%
cols_label(
hb_name = "NHS Health Board",
year = "Year(s)",
sex = "Sex",
cancer_site = "Cancer Site",
easr = "EASR") %>%
tab_options(column_labels.font.weight = 'bold') %>%
data_color(columns = easr, palette = "Blues")
| NHS Health Board | Year(s) | Sex | Cancer Site | EASR |
|---|---|---|---|---|
| NHS Dumfries and Galloway | 2017-2021 | Females | Breast | 174.6153 |
| NHS Lothian | 2017-2021 | Females | Breast | 172.3179 |
| NHS Forth Valley | 2017-2021 | Females | Breast | 171.6585 |
| NHS Lanarkshire | 2017-2021 | Females | Breast | 169.1486 |
| NHS Greater Glasgow and Clyde | 2017-2021 | Females | Breast | 168.8007 |
| NHS Borders | 2017-2021 | Females | Breast | 164.8136 |
| NHS Fife | 2017-2021 | Females | Breast | 164.4207 |
| NHS Tayside | 2017-2021 | Females | Breast | 163.4222 |
| NHS Highland | 2017-2021 | Females | Breast | 162.5039 |
| NHS Ayrshire and Arran | 2017-2021 | Females | Breast | 157.0019 |
| NHS Grampian | 2017-2021 | Females | Breast | 156.2987 |
[fig. 1]
fig1_plot <- cancer_incidence_borders %>%
filter(sex == "Females",
cancer_site %in% c("All Cancer Types", "Breast")) %>%
ggplot() +
# group by cancer_site so each cancer type is drawn as its own line
geom_line(aes(x = year, y = incidences_all_ages, colour = cancer_site, group = cancer_site,
text = paste0("<b>Year:</b> ", year, "<br>",
"<b>Type:</b> ", cancer_site, "<br>",
"<b>Incidences:</b> ", incidences_all_ages)),
size = 1.5) +
scale_x_continuous(breaks = c(1997:2021)) +
scale_colour_manual(values = colour_scheme, labels = c("All Combined", "Breast")) +
theme(axis.text.x = element_text(angle = 45, vjust = 0.5)) +
ylim(0, 500) +
labs(
x = "\n Year",
y = "Incidences\n",
title = "Female Cancer Incidences",
colour = "Cancer Type:") +
theme(panel.background = element_rect(fill = "white"),
panel.grid = element_line(colour = "grey90"))
ggplotly(fig1_plot, tooltip = "text") %>%
layout(hovermode = "x unified",
title = list(text = paste0("<b>Female Cancer Incidences</b>",
"<br>",
"<sup>",
"NHS Borders: 1997-2021",
"</sup>")))
What does this visualisation tell us?
When we look at the year-on-year percentage changes in breast cancer incidences we can gain further insights. The table below shows:
Why might there be a three-year cycle?
Women who meet screening criteria are invited for breast screening once every 3 years (NHS National Services Scotland, 2022).
Why do we not see the peak we might have expected in 2020?
Due to the COVID-19 pandemic, no invites to breast screenings were sent between 30 March 2020 and 3 August 2020 (Public Health Scotland, 2022).
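
The years in which a three-yearly cycle would be expected to peak can be listed with a short sketch. It assumes, purely for illustration, that the cycle is anchored on 1999, the first peak visible in the NHS Borders series.

```r
# Assumption for illustration: the three-yearly cycle is anchored on 1999,
# the first peak visible in the NHS Borders series
expected_peak_years <- seq(from = 1999, to = 2021, by = 3)
expected_peak_years
```

On this assumption 2020 falls on the cycle, which is why a peak might have been expected in that year.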
cancer_incidence_borders %>%
filter(sex == "Females",
cancer_site == "Breast") %>%
select(year, sex, cancer_site, incidences_all_ages) %>%
mutate(yearly_pct_change = round((incidences_all_ages - lag(incidences_all_ages)) / lag(incidences_all_ages) * 100)) %>%
gt() %>%
cols_label(
year = "Year",
sex = "Sex",
cancer_site = "Cancer Site",
incidences_all_ages = "No. of Incidences",
yearly_pct_change = "% Change from Previous Year") %>%
tab_options(column_labels.font.weight = 'bold') %>%
gt_highlight_rows(rows = c(3, 6, 9, 12, 15, 18, 21, 24),
bold_target_only = TRUE,
target_col = yearly_pct_change)
| Year | Sex | Cancer Site | No. of Incidences | % Change from Previous Year |
|---|---|---|---|---|
| 1997 | Females | Breast | 71 | NA |
| 1998 | Females | Breast | 69 | -3 |
| 1999 | Females | Breast | 133 | 93 |
| 2000 | Females | Breast | 69 | -48 |
| 2001 | Females | Breast | 81 | 17 |
| 2002 | Females | Breast | 131 | 62 |
| 2003 | Females | Breast | 67 | -49 |
| 2004 | Females | Breast | 62 | -7 |
| 2005 | Females | Breast | 179 | 189 |
| 2006 | Females | Breast | 68 | -62 |
| 2007 | Females | Breast | 55 | -19 |
| 2008 | Females | Breast | 154 | 180 |
| 2009 | Females | Breast | 94 | -39 |
| 2010 | Females | Breast | 86 | -9 |
| 2011 | Females | Breast | 157 | 83 |
| 2012 | Females | Breast | 103 | -34 |
| 2013 | Females | Breast | 114 | 11 |
| 2014 | Females | Breast | 130 | 14 |
| 2015 | Females | Breast | 90 | -31 |
| 2016 | Females | Breast | 98 | 9 |
| 2017 | Females | Breast | 136 | 39 |
| 2018 | Females | Breast | 97 | -29 |
| 2019 | Females | Breast | 98 | 1 |
| 2020 | Females | Breast | 107 | 9 |
| 2021 | Females | Breast | 149 | 39 |
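
As a worked check of the percentage change column, the calculation can be reproduced for the first three years of the table. This is a minimal sketch in base R, with the counts typed in by hand.

```r
# First three yearly counts from the table (1997, 1998, 1999)
incidences <- c(71, 69, 133)

# Change relative to the previous year, as a rounded percentage
previous <- head(incidences, -1)
pct_change <- round((incidences[-1] - previous) / previous * 100)
pct_change
```

The two values returned match the -3 and 93 shown for 1998 and 1999.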
Question: Is the mean number of female breast cancer incidences in “peak years” (1999, 2002, 2005, 2008, 2011, 2014, 2017) significantly greater than in “non-peak years” (all other years between 1997 and 2021)?
cancer_incidence_borders_sample <- cancer_incidence_borders %>%
filter(sex == "Females", cancer_site == "Breast") %>%
select(id, cancer_site, sex, year, incidences_all_ages) %>%
# Label every third year from 1999 to 2017 as a peak year; all other years are "standard"
mutate(peak = if_else(year %in% seq(1999, 2017, by = 3), "peak", "standard"))
observed_stat <- cancer_incidence_borders_sample %>%
specify(incidences_all_ages ~ peak) %>%
calculate(stat = "diff in means", order = c("peak", "standard"))
null_distribution <- cancer_incidence_borders_sample %>%
specify(response = incidences_all_ages, explanatory = peak) %>%
hypothesize(null = "independence") %>%
generate(reps = 1000, type = "permute") %>%
calculate(stat = "diff in means", order = c("peak", "standard"))
p_value <- null_distribution %>%
get_p_value(obs_stat = observed_stat, direction = "right")
Test Used: Two Sample Mean Test (Independent)
Significance Level: 0.05
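H0: \(\mu_{1} - \mu_{2} = 0\)
H1: \(\mu_{1} - \mu_{2} > 0\)
where \(\mu_{1}\) is the mean number of incidences in peak years and \(\mu_{2}\) is the mean number in standard years.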
Result: Based on a null distribution generated by permutation (1,000 replicates), the p-value returned is below the 0.05 significance level. We therefore reject H0 in favour of H1: the evidence suggests that the mean number of female breast cancer incidences is significantly higher in “peak years”.
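
The logic of the permutation test can also be sketched in base R, without the `infer` pipeline. This is an illustrative re-implementation using the yearly counts from the table above; the seed and the helper `diff_in_means` are assumptions introduced here, so the exact p-value may differ slightly from the one reported.

```r
# Yearly female breast cancer incidences, copied from the table above
years <- 1997:2021
incidences <- c(71, 69, 133, 69, 81, 131, 67, 62, 179, 68, 55, 154, 94,
                86, 157, 103, 114, 130, 90, 98, 136, 97, 98, 107, 149)

# Same labelling as the analysis: every third year from 1999 to 2017 is a peak
is_peak <- years %in% seq(1999, 2017, by = 3)

# Observed statistic: mean in peak years minus mean in all other years
diff_in_means <- function(values, peak) mean(values[peak]) - mean(values[!peak])
observed <- diff_in_means(incidences, is_peak)

# Null distribution: shuffle the peak labels and recompute the statistic
set.seed(2023)  # arbitrary seed, chosen only for reproducibility
null_stats <- replicate(1000, diff_in_means(incidences, sample(is_peak)))

# One-sided p-value: share of shuffles at least as extreme as the observed
p_value <- mean(null_stats >= observed)
p_value
```

Shuffling the labels simulates a world in which peak status is unrelated to incidence counts, which is what the independence null hypothesis states.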
[fig. 2]
fig2_plot <- five_year_summary_long %>%
filter(cancer_site == "Breast",
sex == "Females") %>%
ggplot() +
geom_col(aes(x = age, y = incidences,
text = paste0("<b>Age:</b> ", age, "<br>", "<b>Incidences:</b> ", incidences, "<br>")),
fill = "#0391BF") +
theme(axis.text.x = element_text(angle = 45, vjust = 0.5)) +
labs(
x = "\n Age",
y = "Incidences\n",
title = "Total Female Breast Cancer Incidences by Age") +
theme(panel.background = element_rect(fill = "white"),
panel.grid = element_line(colour = "grey90"))
ggplotly(fig2_plot, tooltip = "text") %>%
layout(title = list(text = paste0("<b>Total Female Breast Cancer Incidences by Age</b>",
"<br>",
"<sup>",
"NHS Borders: 1997-2021",
"</sup>")))
What does this visualisation tell us?
Why might these age groups see increased incidence numbers?
NHS Borders Population Projections:
Females 50+ 2021: 29889
Females 50+ 2041: 31148 (4.2% increase)
(National Records of Scotland, 2023)
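
The projected increase follows directly from the two population figures. A minimal sketch of the arithmetic in base R:

```r
# Projected female 50+ population in NHS Borders, from the figures above
females_50_plus_2021 <- 29889
females_50_plus_2041 <- 31148

# Percentage increase between 2021 and 2041
pct_increase <- (females_50_plus_2041 - females_50_plus_2021) /
  females_50_plus_2021 * 100
round(pct_increase, 2)
```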
Screening data should be reviewed to establish whether the backlog resulting from COVID-19 has been cleared, and therefore whether a further increase in incidences should be anticipated in 2022.
Resources should be allocated according to the observed trend of increased incidences every three years.
Research/Analysis should be conducted to further understand and confirm any reason for this trend, including any links to screening schedules.
Research/Analysis should be conducted to establish whether increased incidence with age is in any way the result of current screening criteria and, if so, whether screening criteria should be widened.
Long-term service planning should take into consideration the ~4% increase in the female 50+ demographic in NHS Borders between 2021 and 2041, as projected by the National Records of Scotland.
SpatialData.gov.scot Metadata Portal: NHS Scotland Health Boards https://spatialdata.gov.scot/geonetwork/srv/api/records/f12c3826-4b4b-40e6-bf4f-77b9ed01dc14
Public Health Scotland: 5 Year Summary of Incidence by Health Board https://www.opendata.nhs.scot/dataset/annual-cancer-incidence/resource/e8d33b2b-1fb2-4d59-ad21-20fa2f76d9d5
Public Health Scotland: Geography Codes and Labels https://www.opendata.nhs.scot/dataset/geography-codes-and-labels
Public Health Scotland: Incidence by Health Board https://www.opendata.nhs.scot/dataset/annual-cancer-incidence/resource/3aef16b7-8af6-4ce0-a90b-8a29d6870014
European Commission, 2023: https://ecis.jrc.ec.europa.eu/info/glossary.html
NHS National Services Scotland, 2022: https://www.nss.nhs.scot/specialist-healthcare/screening-programmes/breast-screening/
National Records of Scotland, 2023: https://www.nrscotland.gov.uk/statistics-and-data/statistics/statistics-by-theme/population/population-projections/sub-national-population-projections/2018-based/detailed-datasets
Public Health Scotland, 2022: https://www.publichealthscotland.scot/media/12843/2022-04-26_breast_screening_report.pdf